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Abstract 

The gyrokinetic simulation code AstroGK is developed to study fundamental aspects of kinetic plasmas 
and for applications mainly to astrophysical problems. AstroGK is an Eulcrian slab code that solves the 
electromagnetic gyrokinetic-Maxwell equations in five-dimensional phase space, and is derived from the 
existing gyrokinetics code GS2 by removing magnetic geometry effects. Algorithms used in the code are 
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1. Introduction 



Gyrokinetics is a limit of kinetic theory that describes the low-frequency dynamics of weakly collisional 
plasmas in a mean magnetic field. Developed for the study of magnetically confined fusion plasmas, it 
has proven to be a valuable tool in understanding the dynamics of drift-wave turbulence, a key cause of 
the enhanced transport measured in modern fusion experiments that leads to poor device performance. 
Tremendous theoretical, computational, and experimental efforts have been devoted to this problem, with 
steady progress over three decades. 

It has been recently recognized that the gyrokinetic approach is also well-suited to the study of astro- 
physical plasmas, including galaxy clusters, accretion disks around compact objects, the interstellar medium, 
and the solar corona and solar wind [ll-Q . Taking advantage of the knowledge and computational techniques 
developed in the simulation of turbulence in fusion plasmas, AstroGK a gyrokinetic simulation code, is 
developed specifically for the study of astrophysical problems. In this paper, we describe the algorithms 
employed in AstroGK and present verification tests and performance results. 

Gyrokinetics describes the low-frequency fluctuations of magnetized plasmas by exploiting the timescale 
separation between the low-frequency dynamics of interest and the fast cyclotron motion of particles, uj <$:n, 
where to is the typical frequency of fluctuations and f2 is the cyclotron frequency. By averaging the kinetic 
Vlasov-Landau (or Boltzmann) equation and Maxwell's equations over the fast cyclotron motion, a self- 
consistent gyrokinetic-Maxwell (GK-M) system is defined in five-dimensional phase space. This system 
orders out the fast MHD waves and the cyclotron resonance, but retains finite Larmor radius (FLR) effects 
and collisionless wave-particle interactions via the Landau resonance 0, Q . 

The theoretical foundation of gyrokinetics has been developed extensively over the past four decades [l|,0- 
[TtI , and gyrokinetics is now broadly employed for numerical studies of turbulence driven by microinstabilities 
in laboratory plasmas. The reduction of the phase-space dimensionality and the relaxed timestep constraints 
under the gyrokinetic approximation have made possible gyrokinetic simulations of fusion devices, yet it is 
still computationally demanding. The simulation of the kinetic dynamics of magnetized plasmas in both 
laboratory and astrophysical settings presents a challenge to the scientific community requiring the most 
advanced computing technology available. 

A number of computational codes for gyrokinetics have been developed and are actively refined worldwide 



by the fusion community [18H26| . These codes are typically classified by the following characteristics: 



Eulerian continuum vs. Lagrangian particle (PIC), local flux tube vs. global toroidal, Sf vs. full—/, and 
electromagnetic vs. electrostatic. AstroGK has been derived from the Eulerian continuum, local flux tube, 
Sf, electromagnetic code GS2 [H, [2^. GS2 was designed to simulate plasma dynamics in fusion devices 
where the magnetic geometry plays a central role. The handling in the code of the toroidal magnetic 
geometry and of particle trapping requires a rather complicated numerical implementation. In contrast, the 
study of the fundamental properties of kinetic plasmas for application to astrophysical situations demands 
the simulation of the dynamics at scales of order the particle Larmor radii on which the magnetic field is 
well approximated as straight or gently curved, so the coding to handle complicated magnetic geometries 
is unnecessary. Therefore AstroGK was created by stripping GS2 of the cumbersome coding necessary to 
describe the magnetic geometry effects, leading to a simplified, faster code ideally suited to study weakly 
collisional astrophysical plasmas. In addition to this, we believe the code is also an ideal testbed for new 
ideas of additional physics, diagnostics, numerical algorithms, and optimizations which may be exported 
back to GS2. 

AstroGK has already proven its usefulness in a number of studies. For example, it has produced the first 
kinetic simulations of turbulence describing the transition from Alfven to kinetic Alfven wave turbulence 



at the scale of the ion Larmor radius in an attempt to understand solar wind turbulence 27|, revealed 
nonlinear phase- mixing properties of turbulence [2^ , enabled the study of the statistical properties of phase- 
space structures of plasma turbulence [2^, explored the transition from collisional to collisionless tearing 
instabilities [s^l , and described the Alfven wave dynamics in the LAPD experiment [sij . 

This paper is organized as follows: In Section [21 the set of GK-M equations solved in the code is 
given. Section [3] describes algorithms employed in the code, including the velocity-space discretization 
and integration, the finite-difference formalism of the GK-M system along the mean field direction and its 
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solution by the special technique developed by Kotschenreuther et al. |18j . the parallelization scheme, and 
some additional features of the code. Section 2] presents tests of AstroGK, ranging from linear electrostatic 
problems to nonlinear fully electromagnetic problems. Comparisons to analytic solutions of the linear 
problems enable the verification of the code and nonlinear examples show potential ability to apply the 
code to complicated problems. Serial and parallel performance measurements of the code on cutting edge 
supercomputers are given in Section [5] In Section [SI we present a summary of the paper. 
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2. Gyrokinetic-Maxwell equations 



In this section, we present the gyrokinetic-Maxwell (GK-M) system of equations solved in AstroGK. For 
notational simplicity, we summarize all the symbols and their definitions in [Appendix A 



We first assume that scale separations in space and time are well satisfied such that small fluctuations 
are locally embedded in a background plasma which is slowly varying spatially and temporally. We consider 
a temporally constant mean magnetic field Bq = Bobo. The mean field is pointing almost parallel to z, but 
is allowed to have curvature k = |(bo • V)bo|. We also assume the amplitude of the mean field is constant 
along its direction, bo • V-Bq = 0, but has finite gradient perpendicular to its direction, bo x VBq 7^ 0. Under 
this assumption, particle trapping do not take place. In the presence of a mean magnetic field, we can adopt 
the gyrokinetic ordering and average over the fast cyclotron motion to reduce the Vlasov-Maxwell equations 
to the GK-M equations; see Howes et al. [H and Schekochihin et al. Q for derivations of these equations 
expressly intended for the study of astrophysical plasmas. 

Under the gyrokinetic ordering, the distribution function of particles up to the first order is given by 

fs^(l-'^+'-^-^.).hs + K, (1) 



Tqs f^s 

where Jqs = nos/ {y/T^vth.s)^ c'xjp{—v'^ /v"^^ ^) is the zeroth-order, equilibrium Maxwcllian distribution func- 
tion. The first-order part of the distribution function is composed of the Boltzmann response term, a 
term due to gradients of the equilibrium, and the gyro-center distribution function defined in the 
gyro-center coordinate Upon averaging over the gyro-phase, the gyrokinetic equation evolves 

dh, dhs ( dh, ^ dfos\ qsfosd(xlR^ 

where parallel and perpendicular subscripts refer to directions with respect to the mean magnetic field. The 
perpendicular drift velocity is given by 



2Vts ^° Qs OR, Ba 



where the terms correspond, from left to right, to the gradient-i? drift (L^^ = —d\nBo/dXs), the curvature 
drift, and a nonlinear drift 1^ We have taken the direction of the gradients of Bq and /os and the curvature 
in the —Xs direction. The gyrokinetic potential is given by x = — f ■ and the linear collision term is 
represented by C{hs). The angle bracket { ■ )h., denotes the gyro-average at fixed gyro-center coordinate 

{F{r))n, =i-iF(R. + Yl^] de., (4) 



271 J \ 

where Vg = (Vj_^s, V|| 0^). (The gyro-average at fixed particle coordinate ( • )r can also be defined by 
switching roles of r and Rg.) 

In the GK-M system, the electromagnetic fields are specified by the three scalar functions (f){r, t), (r, t), 
and dBii{r,t)E 

according to: 

dA 

B =S/±Aii X z + SBiiz, E = ~V(j)~—. (5) 



^ Vs = i>, and, normally, is not distinguished from v. See |Appendix A.l| 

^ (/>, V^i A|| , and • A± terms in x yield the E X B drift, a parallel streaming along the perturbed magnetic field, and the 
V-B drift due to , respectively. 

= (Vx X j4x)z. Wc use the Coulomb gauge, which leads to Vx ■ ^x = with the ordering. Then, we can write 
ylx = Vx? X z, and <5_B|| = — V^f in terms of a single scalar function ? . 
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Maxwell's equations in the gyrokinetic limit reduce to the quasi-neutrality condition, and the parallel and 
perpendicular components of Ampere's law: 



E 



Tos 



(p + Qs {hs)rdv 



= 0, 



s •' 



(6) 
(7) 
(8) 



To lowest order, the quasi-neutrality condition and the perpendicular Ampere's law imply the constraints 
on the background plasma of quasi-neutrality and total pressure balance: 



^ nosQs =0, 



d 



=0. 



(9) 



It is advantageous, both analytically and in the code, to Fourier transform the equations only in the 
plane perpendicular to the mean magnetic field. The distribution function and the fields are decomposed as 



fcx 



(10) 
(11) 



with fej^ = (kx, ky,0). The virtue of expressing variables in Fourier space is that the gyro-averaging operation 
becomes a multiplication by a Bessel function. For example, the gyro-average of the gyrokinetic potential 
is given by 

ros2^1s Ji(a,)5B||,fc^ 



(Is «th,. 



\k^R, 



(12) 



where Jn is the Bessel function of the first kind with the argument as = k±V_\_^s/^si taking fc^ = |fc_L|. The 
Fourier coefficients of the fields are now functions of Zg and t, for example 4>k^ = (/>fcj^ (Zg, f). 

In the large-scale limit k±pi <^ 1, Alfvenic perturbations have a gyro-center distribution function hg 
that is largely canceled by the Boltzmann response term {qs4'/Tos)fos (see, for example, Section 5 of Q). 
To avoid numerical error arising from this near cancellation, a complementary distribution function Qs is 
introduced, given by 

9s = hg-^{<j>^v^-A,_)^^. (13) 



To. 



After normalization described in [Appendix A. 2 we finally obtain a normalized version of the GK-M equa- 
tions as follows. The normalized gyrokinetic equation is 



dgkj 



dt 



/Tqs dhk 
m 



8Zs +2-^(i(^)--~''^^}) 



V 



- L-r, 



dA,k 



-j^^V\\^gMas)^^ + CkAhk^,s), (14) 



where =V]^ ^ + Vi^ ^ and the normalized gyrokinetic potential is 



(x) 



E 



Jo{_as)(j}k^ 



-V||,3Jo(as)^|l,fc_L + 



(15) 
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The nonlinear term, given by the Poisson bracket {a,b} ~ {da/dXs){db/dYs) — {db/dXs){da/dYs), is 
evaluated in real space and then transformed into Fourier space (denoted by J-) . The terms in the gyrokinetic 
equation due to gradients of the background plasma, given in square brackets, contribute at the same order as 
the other terms, and are characterized by the parameters L^l, L~^^ = —dlnnQsl dXg^ = —dlwTQs/ dX^, 
and the curvature k. The normalized Maxwell's equations are 

<l>k^ E ^ (1 - To.) - <5i?||,;,, E^^^o.ri, =^g,M(0)(g;,^,,), (16) 



To, 



o« \k^=Y.'i^'^^^^'^^^^9k^,s), (17) 




The plasma beta of the reference species, defined by the ratio of the thermal pressure of the reference species 
to the magnetic pressure of the mean magnetic field, is given by Pq. The operators Al^") to take the nth 
order moment of the distribution function are: 

M^°\gk^,s) ^nos I gk^,sMas)-^<^Vs, (19) 



TT 



M^^\gu^,s) -J— / gk^^sV\\.sMa,)^Av,, (20) 



r3/2 

The function r„s = r„(fes) arises from the integration over perpendicular velocity space of products of two 
Bcssel functions. For n = 0, 1, 2, it is given by 

Toibs) ^Io{bs)e-''% ri{bs)^{Io{bs)-Iiibs))e~''% r2(6,) =2ri(&,) (22) 

where /„ is the modified Bessel function of the first kind, and the argument is bs ~ [k^psY /2 [ij. 

The background plasma must also satisfy the normalized quasi-neutrality and total pressure balance 
constraints: 

E "o^^s =0, L~sl + Y E "o^^o^ (^"„l + ^Tol ) = 0. (23) 



We defer description of the explicit form of the collision operator used in the code to [Appendix B 



as 

it has a rather cumbersome form and is fully documented in j32l |. We mention here the basic properties 
of the operator. The operator is based on the linearized Landau collision operator transformed into the 
gyro-center coordinate. It has second-order velocity derivatives providing diffusion in velocity space and 
conserving terms which include integrations over velocity space. It is constructed to satisfy Boltzmann's 
ii'-theorem and the conservation of particles, momentum, and energy. It contains both like-species collisions 
and inter-species collisions, but the inter-species collisions account only for the collisions of electrons with 
one species of ions with large mass. Note that the linearized collision operator for a given species can be 
made independent of the first-order evolution of any other species. The theoretical basis of the collision 
operator is discussed in detail in f33| . 



6 



3. Algorithm description 



This section describes tlie numerical algorithms used in AstroGK to evolve the GK-M system of Eqs. ((T4)) . 
and (fTBl) - (fT^ . The gyrokinetic equation combined with the field equations together comprise a set of integro- 
differential equations for the evolution of the distribution function g defined in the five-dimensional phase 
space {X, Y, Z,V\\,V±). In this section, the species subscript s is omitted unless necessary. Periodic boundary 
conditions are assumed for the spatial dimensions {X,Y,Z), and the derivatives in the plane perpendicular 
to the mean field, {X,Y), are handled using a Fourier-spectral method. Except for the nonlinear term, 
each Fourier mode is independent of the others in the gyrokinetic equation, so we omit the k± subscript for 
simplicity. In the numerical implementation described here, because fields are calculated separately from 
the gyrokinetic equations, the gyrokinetic equation for each species is essentially independent of that for the 
other species. The mean field parallel direction Z and the time t are discretized by Zi = iAZ (i = 0, • • • , Nz) 
and tn = y^" _i Atj, where AZ is fixed and Atj may vary to satisfy the Courant-Friedrichs-Lewy (CFL) 
condition [sjj for the nonlinear term. Velocity space is discretized with grid points chosen by Gaussian 
quadrature rules for optimal integration, generating nonuniform meshes. 

3.1. Velocity- space integration 

The velocity grid in AstroGK is specified by (A, £',(t), where the pitch anglejfl is A = v'^/v'^, the energy 
is E = v'^ + and the sign of parallel velocity is ct = sgn(w||). The velocity-space integral to calculate 
moments is represented by an integration of some function F{X, E,a, . . .) multiplied by the Maxwellian e~^: 



(24) 



Gaussian quadrature evaluates an integral of a function F{x) with weight W{x) by 



[ W{x)F{x)dx ^^WjF{ 

J'^ .7 = 1 



(x,), (25) 



where Xj is the jth root of the A^th order polynomial, and Wj is the corresponding discretized weight. The 
weights for the Gauss-Legendre and Gauss-Laguerre rules are given by 

<' = f . . ^ ... ^ / ,,,,2 . (26) 



where Pat and L^r are the A^th order Legendre and Laguerre polynomials, respectively [35|- The superscripts 
'Leg' and 'Lag' to Xj and Wj explicitly denote that they are associated with the Gauss-Legendre and Gauss- 
Laguerre rules. The weight function W{x) and the integration range are chosen according to the polynomial. 

For the pitch-angle integration, the Gauss-Legendre quadrature (a = — 1, 5 = 1, W{x) = 1) is immedi- 
ately applied by defining ^ = vl — A: 

Vi-A Jo 

(27) 








*The pitch-angle parameter A is used for historical reasons. If the magnitude of the mean magnetic field changes along 
its direction, particles may be trapped in magnetic wells. Trapped and untrapped regions in velocity space are conveniently 
described by the A coordinate [Tsll . Since AstroGK does not contain this physics, the use of A is not necessary, but is inherited 
from GS2. The pitch-angle variable § = could be used instead. 
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The integration range is changed using a hnear transformation to fit the range of the Gauss-Legendre 
rule. Nx is the number of grid points describing the A grid and is specified by the user input, ngauss: 
Nx = 2 X ngauss. 

A rather careful treatment of the energy integral is necessary because there arc singularities of the inte- 
grand at E = and E = oo which may prevent a simple approximation from achieving spectral convergence. 
To avoid the problem, we split the integration range at i?cut = '^cut i^^o a lower and an upper range and 
change the variable to v from E for the lower range integration: the Gauss-Legendre scheme is used for the 
lower range; and the Gauss-Laguerre (a = 0, b = oo, W{x) = e^^) is used for the upper range. Therefore, 
the integration is approximated by 

/ F{E,...)e-^VEdE = F{v^ , . . .)e-''\v^Av + I F{E, . . .)e^^VEdE 

Jo Jo J 



^Wcut Lcg^^ I^^Log 



2 ^ ^ " V 2 



+ 



(28) 



where 

Gi(x) =F(x2,...)e-"'2x2, G2{x) =F{x, . . .)^/^, (29) 

and the integration ranges are shifted appropriately. We allow users to specify Wcut = vcut, Nj^ = nesub, 
= nesup, and Ne = N'^ + Nj^ = negrid in the code input. 

We also have another mode to evaluate the energy integral in the code called the 'egrid mode', whereas 
the above method is called the 'vgrid mode'. In the egrid mode, the energy integral is calculated by the 
method suggested by Candy and Waltz which is not exponentially accurate. We note, however, that 
if Ne is very small (< 8), we find empirically that the egrid mode may give better results than the vgrid 
mode. Therefore, the optimal choice of energy grid mode is governed by the simulation parameters. 

Further discussion of our velocity-space coordinates can be found in (36| . 

3.2. Time integration 

The gyrokinetic equation is symbolically denoted by 

^ = Cg + Cg+U{g,g) (30) 

where C is the linear term except the collision term, C is the collision term which is also linear, and Af is the 
nonlinear term. We consider the time derivative using first-order finite differentiation for the linear term, 
treating the collision term by the implicit Euler method, and handling the nonlinear term explicitly by the 
third-order Adams-Bashforth method (AB3): 

^ Cg-W, + Cg"+\1 - n) + Cg^+^ + ^AA" - -AA"-i + ^M""' + 0{At) (31) 

where Af" = J^{g" , g""). The time-centering parameter rt (fexp in the code input) may be chosen within 
the range < < 1, where rt = I {rt = 0) represents a fully explicit (implicit) scheme. If vt < 1/2, the 
scheme is stable for any At. We mainly use an implicit trapezoidal rule rt = 1/2 which is second-order 
accurate and free from time step restrictions due to the linear term. Hereafter, we fix rt = 1/2. 

Note that Eq. (j3ip is linear with respect to g"^^ due to the explicitness of the nonlinear term. We then 
employ a Godunov splitting technique |37|, which is first-order accurate in At. to separate the collision term: 

„(*) _ a" + 23 4 5 

9 9 + 9 ^ ^_^n „ ^ j^_^n-2^ (32) 



gn+1 „ gi*) 



At 2 12 3 12 

Cg"+i +C'(At), (33) 



At 
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which greatly reduces the size of the matrix to be inverted. Solving for g , we obtain 



1 + ^c) + At (|aA" - \n-' + ^N-' 



0{Af). (34) 



The method is first-order accurate in time. The use of AB3 for the nonlinear term is to make nonlinear runs 
stable. Note that the first time step for the nonlinear term is evaluated by the (explicit) Euler method (which 
is also first order in At), and the second timestep is evaluated using the second-order Adams-Bashforth 
scheme (AB2). 

A second-order accurate method may, in principle, be derived by applying a higher-order scheme for the 
collision term as well, and by using a Strang splitting for the operator splitting. The first two steps for 
the nonlinear term could also be computed using a higher-order method. These ideas are not implemented 
in the current version. 

3.3. Gyrokinetic solver 

We now describe the implicit advance of the linear terms in the gyrokinetic equation. Basically, the fields 
in the gyrokinetic equation can be obtained by a separate procedure, and the gyrokinetic equation becomes 
a differential equation with the given fields. Thus the coUisionless gyrokinetic equation is written as 

9-t+-^9^+-og = h-^+b,. — + bo-^ + S, (35) 

where ^ — {(f>, A^\, 5B\^), coefficients az,o and 5t,z,o arc functions of V± and V\\, and S contains the AB3 
nonlinear term. The collision term is always consecutively applied after (|35|) . but separately (Section l3.3.4p . 

We use a compact finite-difference method for evaluation of d/dZ to achieve up to second-order accuracy 
with keeping the matrix bi-diagonal. To take into account global information, compact finite differencing 
schemes use the derivatives at neighboring grids to evaluate the derivatives. A general formula of the 
compact finite-difference scheme using two neighboring grids is 



i^±^ + 0{AZ), (36) 



for > 0. (V^i is a coefficient of dg/dZ. See (jl4[) .) Information at i — 1 instead of i -t- 1 should be 
used for < 0. (In the following discussions, we show equations for V|| > only.) < < 1 is the 
space-centering parameter specified by bakdif in the code. We fix rz = in the following discussion, and 
therefore second-order accuracy is achieved. 

Combining the trapezoidal rule for the time derivatives with the compact scheme for the space derivative^ 



aT^ + AZ + 



, > ^l + l 



At AZ ''"-Ki+^^i (37) 

where i -f 1/2 denotes the average value of the variables at i and i + 1 grids, and similarly for n + 1/2. Then, 
the gyrokinetic equation is cast into the following symbolic form: 



A,g^ + A2gt,,+B,g^+'+B2g:+' 



Di • + D2 • *r+i + El ■ + E2 ■ + + (38) 



^ The combination of the trapezoidal rule and the second-order compact finite differentiation was first suggested by Beam 
and Warming 1391 . Note that the so-called Beam- Warming scheme refers to a different scheme. 



9 



3.3.1. Kotschenreuther's Green's function approach 

Kotschenreuthcr ct al. developed an efficient way to solve the gyrokinetic equation by breaking the large 
matrix to be inverted into a number of small matrices [ist - In the method, fields at the future timestep n + 1 
are obtained using a Green's function formalism to decouple parts of the matrix related to velocity-space 
integrations in the gyrokinetic equation. Consequently, the matrix to be inverted becomes Nz x Nz at 
each velocity grid for each species. (Due to the periodic boundary condition, i = 0, Nz modes are not 
independent.) The actual scheme (40j that was originally implemented in GS2, and has been inherited by 
AstroGK, differs slightly from that described in the original paper by Kotschenreuthcr et al., as described 
below. 

Because (|38p is linear with respect to variables at timestep n+l, the solution to the equation may consist 
of any linear combination of solutions to parts of the equation. Thus we may split the solution at timestep 
n+l into two pieces, g"'*'^ = g^'"'^' + g^^\ each satisfying the following equations (here we ignore the spatial 
index i in the interest of clarity): 

Ag"" + Sg^'"'^) =D ■ + E ■ *" + S", (39) 
Bg(h) . *(*)^ (40) 

where = + ij>(*). An inhomogeneous piece g^™^) depends only on the known quantities at timestep 

n, thus is immediately solved, while a homogeneous piece g^^^ depends on the fields at timestep n + l. The 
unknown portion of the fields may be solved as a separate step using a Green's function approach. 
A formal homogeneous solution is given by 

.B-.. (I) *... + (^).<-. + (^)»i-. («) 

where (Sg/Scf)), {6g/SA\\), and {5g /S{SB^\)) arc called the plasma response matrices. Using the response 
matrices, the field Eqs. ((T6)) - (|T8l) can be written as 



(42) 



Pll 


P12 


Pl3 


P21 


P22 


P23 


P31 


P32 


P33 
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where 



Pii 

Pl2 
PlZ 
P2I 
P22 
P23 
P3I 
P32 
P33 

with / being the Nz x Nz identity matrix and 

Qi =r E ^ (1 - To.) - 5Bl( E IsnosTu ^ 9^^^°' (^i'""') , (52) 

s ^'^ s s 

E l^^osM^'^ (g^)) , (53) 
Q3 = - 0" E 9sno.ri, - 6B\( + E '^OsTo.ra, j - X(^) (sf . (54) 

By solving the field Eq. (|42|), we obtain and ultimately Given *"+\ we solve ([38]) for 

This is equivalent to solving with • replaced by E ■ VI>"+i. Finally, the full procedure to solve the 
gyrokinetic equation becomes: 

1. Solve (|39p and apply the collision term to obtain the inhomogeneous part of the distribution function 

^(inh)^ 

2. Solve dHI) for and obtain 

3. Replace by 5"+-^ and E ■ by E ■ '$'"+1 in p9|l . solve it, and again apply the collision term to 
get 

Up to this point in the subsection, we have ignored the i index for notational simplicity. In fact, the 
gyrokinetic equation and the field equations should be solved at all Z grids simultaneously. The field vector 
in is actually a vector of length NfNz, where Nf is the number of fields evolved. An electrostatic 
simulation requires only the evolution of and so has Nf = 1, whereas a fully electromagnetic simulation 
evolves (f>, A^^, and SB^^ and so has Nf = 3. Each of the elements of the P matrix in represents an 
Nz X Nz matrix, so the entire P matrix is of size {NfNz) x (NfNz). Similarly, the Q vector is of length 
NfNz. 
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s '- 

^E-o.^^^^(t 
-t-E.---(||)^ 



\S{SB\\ 



-E 



qsnosT2sI 



f 5gs 
U(<5i?||) 



(43) 
(44) 
(45) 
(46) 
(47) 
(48) 
(49) 
(50) 
(51) 



To evaluate the computational efficiency of Kotschenreuther's approach, let us compare it to a brute- 
force approach to solving the GK-M system. A brute-force approach requires the inversion of a dense 

{NzNxNENs)-size square matrix (where Ns is the number of species), which generally takes O (^{NzN\NeNs) 

operations. On the other hand, Kotschenreuther's method requires the following: (a) for the gyrokinetic 
equation, {N\NeNs) inversions of bi-diagonal A^^-size square matrices, which costs 0{N zN \N eN s)\ and, 
(b) for the fields, the inversion of the matrix P in (|^^ . a dense {NfNz)-sac square matrix. For a fixed 
timestep At, however, the matrix P does not change, so this matrix inversion need only be performed once 
during an initialization stage. Therefore, during each timestep the field solver requires only an O (^{NfNz)'^) 
matrix multiplication operation. Ignoring the factor Nj since Nf < 3, Kotschenreuther's approach requires 
C (^N'^NxNeNs) operations per timestep, much more efficient than the brute-force approach. 



3.3.2. Response matrix 

A remaining task is to determine the response matrices, given in (j4ip . For clarity, we rewrite (j4ip with the 
i index (in the code, the i index is counted from — ntgrid+ 1 to ntgrid for T^i > with 2 x ntgrid = Nz). 



at' 



= p- 



) 

2 



4^*) / 



E, 



SB,, 



( 5P\*l \ 
5P\*l 



(55) 



where B and E^^A\\,5B\\ are all Nz x Nz matrices. To obtain the response matrices Eij,^A\\,5B\\ , we solve 
(1551) column by column. If we put trial functions 4>i = (where Su is the Kronecker's delta), A\\^i = 0, 
5B\\^i = in the RHS of ([55)) . a solution contains the Zth column of B^^E^. By running / from 1 to Nz, 
the full response matrix is calculated. The same procedures are carried out to obtain the other response 
matrices. This is fundamentally a Green's function method^ Note that we can use the same routine to get 
the response matrices as that to solve ([38]) by replacing 5" = 0, = 0, and *"+^ with the trial functions. 



3.3.3. Boundary conditions 

The boundary condition in AstroGK in the Z direction is always periodic, thus it is rather trivial to 
implement the boundary condition compared with GS2. For the field equation, periodicity is immediately 
satisfied if the trial functions described in Section 13.3.21 satisfy periodicity. For the gyrokinetic equation, 
periodicity is imposed in the matrix B. Let us write the gyrokinetic equation as 





/B2 


••• 


Bi\ 






Bi 


B2 ■•• 







Bx = y where B = 





Bi B2 


■■■ 


(56) 




U 


... 


Bi B2) 





®The method can be trivially extended to Fourier space along Z when the cocfBcicnts of the linear terms are independent 
of Z . This may be advantageous for some applications but is not described here. 
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The matrix B is readily LU decomposed by forward elimination to yield: 



/i ... ^(-t) \ 




thus the equation is solved. 
3.3.4- Collision term 

The collision term is handled by operator splitting as shown in (j33p . After splitting, one must com- 
pute (1 — AiC)^^, where C is an advanced model collision o pera tor designed for gyrokinetics (Section [2] 
and [Appendix B| . The numerical implementation is given in [32l |. Here we provide a brief overview of the 
key features. 

The collision operator can be written schematically as 

C = Cl+Cd+Wl+Wd, (58) 

where Cl and Cd are second-order differential operators describing pitch-angle scattering and energy dif- 
fusion, respectively, and Z^l and Uu are integral operators designed to make C conserve particle number, 
momentum, and energy. Since the collision operator includes velocity-space derivatives and integrals, but 
no coupling in Z, it consists of a dense {N\Ne) x {N\Ne) matrix. 

Discretization and inversion of the operator 1 — AtC are done carefully to minimize computational 
expense and to preserve numerically the analytic conservation properties. First, we discretize the differential 
operators Cl and Cd on a three-point stencil using a novel discretization scheme that guarantees exact 
conservation properties on AstroGK's nonuniform velocity-space grids. This is accomplished by incorporating 
the Gaussian integration weights in the discretization, leading to a first-order accurate scheme across most 
of the velocity-space domain. 

Next, we apply another Godunov splitting so that we can invert two reduced matrices: 1 — At (Cl +Z^l) 
and 1 — At (Cd + Z^d)- Both Cl and Cd can be written as tridiagonal matrices, but Z^l and Uu are in general 
dense. However, both Ut, and Ud can be expressed as tensor products, so that we can use the Sherman- 
Morrison formula (4ll . |43 | to reduce greatly the numerical expense of the matrix inversions. Therefore, the 
inversion (1 — AtC)~^ is ultimately computed by inverting a small number of tridiagonal matrices. 

3.4- Parallelization scheme 

Decomposition of the gyrokinetic distribution function is accomplished using a flexible parallelization 
scheme that allows for several different memory layouts. In principle, the gyrokinetic distribution function 
is 

g{k^,ky,Z,\,E,a,s), (59) 

comprising a scalar function defined on seven-dimensional discrete space. Since field quantities, 0, A|| , and 
(5-B|| , are three-dimensional and therefore have much smaller memory sizes than the distribution function, 
a copy of the full three-dimensional field information is held by each processor^ Thus we focus on the 
parallelization scheme of the distribution function in this section. 



^This can be a problem if we consider extremely large simulations. We may take a large number of processors Nproc, such 
that Nf;^Ni^^NxNi;Ns/Nproc <SC Nf^^Nf^^ , as long as the work load on each processor is large enough to scale up computational 
efficiency. In such cases, data for the fields rather than the distribution function dominate memory usage. 
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The update of the distribution function using the gyrokinetic equation, as described in Section 13.31 is 
accomphshed in several stages, including the main stage fSection l3.3.ip . collision term stage (Section l3.3.4p . 
and nonlinear term stage. Each of these stages performs operations over different dimensions of the seven- 
dimensional distribution function data. For example, the linear terms of the main stage involve compact 
finite differencing in Z\ the collision term employs velocity-space derivatives involving A, cr, and E\ and the 
nonlinear term employs Fourier transforms requiring information along the components of the perpendicular 
wavenumber, and ky. We refer to the dimensions required for the current operation as the active 
dimensions, while the remaining dimensions are inactive. 

The basic parallelization strategy is to place the seven-dimensional distribution function data into an 
array in memory such that the indices of the array associated with the active dimensions come first. Then all 
of the data associated with the inactive dimensions are combined into a single, final array index in the order 
specified by the variable layout. The data array is then decomposed across A^proc processors by splitting it 
up evenly over the final index containing all of the inactive dimensions. After one stage of the calculation is 
completed, the distribution function data is redistributed in preparation for the next stage of the calculation. 
This redistribution is accomplished by a series of one-to-one communications, carefully designed to perform 
only necessary communications and to avoid communication deadlocks. During the initialization for each 
run, the code determines this redistribution scheme. 

For the gyrokinetic solver (Section [33]), the redistribution of the seven-dimensional distribution function 
data into a three-dimensional data array is performed as follows. Since compact finite differences are used 
along the mean magnetic field (Z), the first index of the data array corresponds to the Z dimension. The 
second index is associated with a since most operations arc common to the same |?;|||. The remaining five 
dimensions of the distribution function data arc combined into the third index of the data array. The order 
of this combination may be chosen by the user to yield the best parallel performance by setting the character 
input variable layout, which consists of five alphabetical characters indicating the order of the dimensions. 
At present, the following options are available for layout: 'Ixyes', 'lyxes', 'lexys', 'yxels', and 'yxles'. 
For example, if layout = 'yxles', the data arc combined into the final index such that ky changes first, 
then fca;, then A, then i?, and finally s. Such a layout is advantageous, especially for collisionless simulations, 
when redistribution to calculate the collision term is unnecessary. If the number of processors A^proc is 
chosen such that all k^ and ky indices are on the local processor [requiring mod {N\NeNs, Npi-oc) ~ 0], 
then redistribution for the nonlinear term is unnecessary. 

For the nonlinear term, if the number of processors A^pioc or the layout is chosen so that all k^ and ky 
indices are not on the local processor, then the code redistributes the distribution function data over the 
processors into a two-dimensional data array with k^ (or ky) as the first index and all remaining inactive 
dimensions into a second index to be split over processors. In this case, Z and a are the first dimensions to 
be combined, and then remaining dimensions follow the layout specified in layout, except for k^ (or ky). 
It then performs the one-dimensional fast Fourier transform in k^ (or ky) to compute the nonlinear term. 
For the collision term, the code redistributes the data with A and cr first to obtain the pitch angle ^ for 
the pitch-angle scattering term, and with E first for the energy diffusion term. Therefore, the more physics 
that is included in the calculation, the more redistribution that occurs, and the performance of the code 
diminishes as the complexity of the problem is increased. For large simulations employing iVpioc > 1000, one 
must choose the number of processors and layout carefully to prevent the computational effort required for 
data redistribution from becoming a bottleneck. 

3. 5. Additional features 

We list here some additional features implemented in AstroGK which have not yet been described. 
3.5.1. Driven simulation 

One might want to drive the system externally. For instance, turbulence simulation is quite often driven 
externally at large scale to achieve a driven-dissipative system where stationary turbulence inertial range 
spectrum will be observed. 
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In AstroGK, wc can add yiantcnna drive an antenna current in the direction parallel to the mean 

magnetic field. In this case, the normalized parallel Ampere's law (jl7p is modified as 

^ (ylll,;,^ + ^|:f^""^) = ^ qsnosM^'H.9k^,s). (60) 

Property of the antenna is specified by its amplitude A^\o, frequency ujq, and wavenumber fco^ 

A\lkTr ^lloe"'^"''*"''"''^^. (61) 

3.5.2. Timestep 

The timestep At is variable in AstroGK. For linear runs, the numerical scheme is unconditionally stable 
as long as rt < 1/2, so adjustment of the timestep is necessary only to achieve an acceptable accuracy and 
does not affect the stability. For nonlinear runs, the nonlinear term is handled explicitly and therefore the 
code automatically adjusts the timestep to meet the CFL condition according to the nonlinear drift velocity 
in the plane perpendicular to the mean magnetic field. 

We estimate an acceptable timestep based on the CFL condition as 

Aa; Av 

AtcFL = CcFL min I ^, — ^ I (62) 



max I v^^^ 



where the nonlinear drift velocity is given by the third term in Ax ~ ^Tr/kx^nrnx, Ay = ^ir/ky^yaa-x, 
and CcFL is a user input constant. We check if the current time step is greater than A^cfl at every timestep, 
and divide At by a constant factor (2 by default) if At > AtcFL- We also increase At by multiplying the 
same factor if At is substantially smaller than A^cfl- 

If At changes, some matrices must be updated accordingly, so the initialization routine for the matrices 
is called at this timing. The first two steps for the nonlinear term after the timestep change are again the 
Euler method and AB2. 

3.5.3. Diagnostics 

AstroGK can write out the full data of the fields and the distribution functions. Frequent output of the 
full data is undesired due to the consumption of large amounts of disk space. Instead, the code computes 
reduced diagnostic data on the fly, including the following: 



Macroscopic quantities, such as the entropy, energies, heating rates, and certain averages of the fields 
and the moments [l|, ■ 



Spectra in position and velocity spaces [27l - l29f . 

Transfer functions of energies in position and velocity spaces [2£ 

Linear frequency and growth rate [l|, [sij . 



Error estimate of velocity-space resolution [321 136|. 



Some of the output utilize NetCDF or HDF5 library to make structured binary data for portability. 
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4. Code verification 

In this section, we provide some examples of AstroGK test mns, ranging from tlie simple linear, electro- 
static (A|| — ~ 0) problems to fully nonlinear, electromagnetic problems. Validity of the linear physics 
within the code is demonstrated by the comparison of linear AstroGK results with analytic solutions. For 
nonlinear problems, comparison to an analytic solution is generally not simple, so here we compare with 
the nonlinear results of an independent reduced magnetohydrodynamics (MHD) code. In the large-scale 
limit k±pi <C 1, the gyrokinetic-Maxwell equations simplify to the equations of reduced MHD so such a 
comparison is useful to validate the nonlinear behavior of AstroGK. It is worth noting that the gyrokinetic 
simulations contain much richer physics and show interesting phenomena, but a detailed analysis of such 
behavior is beyond the scope of this paper and is left for future studies. 

Unless otherwise stated, the plasma is assumed to be quasi-neutral with noi/noc = —Qi/Qc = 1- Free 
parameters are the ion-to-electron mass ratio mi/mo, the equilibrium ion-to-electron temperature ratio 
Toi/Toe, and the plasma beta f3s of the reference species s. 

We comment on the number of grid points in the perpendicular plane of position space compared to the 
corresponding number of Fourier modes kept in the pseudo-spectral method. The number of grid points 
in the perpendicular plane and Ny are specified by user. Dealiasing requires that Fourier modes must 
be discarded according to the 2/3 rule [i^, giving the number of perpendicular wave modes in AstroGK 
Nk-^ ~ 2/iNx and Nk^ ~ l/3Ny. Note that the reality constraint on the complex Fourier coefficients means 
that the modes in the lower half {ky < 0) of the {k^, ky) plane are not independent and need not be kept. In 
addition to the standard operating mode that nonlinearly evolves many Fourier modes, AstroGK also allows 
a very useful linear mode where = Ny = 1 and only \k±] needs to be specified. 

^.1. Linear physics of Alfven waves 

Testing the ability of AstroGK to model accurately the linear physics of Alfven waves, including their 
dispersion and coUisionless damping at sub-Larmor radius scales, is critical to demonstrating the validity 
of the code. The linear coUisionless gyrokinetic dispersion relation [l[ describes the physical behavior and 
provides analytical solutions for comparison to numerical results. It is worth noting here that the results 
of the linear coUisionless gyrokinetic dispersion relation agree with the full Vlasov-Maxwell hot plasma 
dispersion relation [i^l in the gyrokinetic limit, as has been demonstrated by Howes et al. [l[. 

The tests in this subsection explore the linear plasma response in a uniform plasma with a straight 
equilibrium magnetic field, so that L^^ = = L^J = k = 0. All tests use a realistic ion-to-electron 
mass ratio for protons, mj/me = 1836. The linear frequencies and damping rates presented here are 
normalized by the MHD Alfven frequency wa (defined later on), denoted by the overbar, uJ = lj/lua- With 
these simplifications, the normalized complex eigenfrequency of the linear coUisionless gyrokinetic dispersion 
relation is a function of only three parameters, uJ = Toi/Too). 

4-.1.1. Linear Laplace- Fourier transform solution 

Alfven waves can be driven in AstroGK by applying a parallel antenna current throughout the simulation 
domain (see Section 13. 5p . The parallel vector potential due to the antenna drives a single wavevector 
feo = k±x + fc|[£ at a frequency ujq with amplitude ^||o- Using the parallel wave number of the drive, we 
can define the Alfven frequency for the normalization as wa = ^'||Wa, where the Alfven velocity is defined by 
VA = Bo/y/^onoirrii. 

The amplitude response of the driven linear gyrokinetic system can be compared with an analytical 
Laplace-Fourier transform solution, given by (jC.lOp in [Appendix C[ For the example presented here, the 
parallel magnetic field perturbation was forced to be zero, SB\^ = 0. In Fig.[T] we plot the analytical Laplace- 
Fourier solution of the gyrokinetic system (solid line) and compare it to the output of AstroGK (dashed line). 
Parameters for the run are k±pi = 1, /3i = 1, and Tqi/Tqc ~ 1; the system is driven with amplitude A||q = 20 
(in arbitrary unit) at frequency ZUq = 0.9 from zero initial conditions for the fields and perturbed distribution 
functions g^- From the linear gyrokinetic dispersion relation, the linear eigenvalue for these parameters is 
uJ = WR + iaJi = 1.4057- 0.0730041. The numbers of grids are {Nz,Nx,Ne) = (32,8,32). These choices 
achieve a converged result with minimal computational effort. Low collision frequencies for the same-species 
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Figure 1: Evolution of the amplitude vs. time from a linear AstroGK run (dashed line) with <5i?|| = for parameters 

kj_pi = 1, /3i = 1, and Toi/Too = 1. The Laplace-Fourier solutions of the gyrokinetic system (solid line) and a simple model 
system (dotted line) are given for comparison. 
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collisions are chosen to yield weakly coUisional behavior, with v-Juja = ^'c/'^a = 2 x 10~^. The close 
agreement of the solid and dashed lines in Fig. [T] — faithfully reproducing even the small amplitude, high- 
frequency oscillation — demonstrates the ability of AstroGK to model accurately a driven linear gyrokinetic 
system. 

To determine the effective frequency oJr and damping rate — aJi of the linear response to the driving 
in an AstroGK simulation, the Laplace-Fourier transform solution can be found for a model linear system. 
The model system treats the time evolution of A|| as a linear operator with a complex eigenvalue given by 
— i(wR + iwi) and includes the driving term, 

dAw 

^ = -ic^A||+A||oe-''^°*. (63) 
The Laplace-Fourier transform solution yields the time evolution of the amplitude: 

After normalizing frequencies appropriately, when oJq = 0.9, this function yields the fit (dotted line) in 
Fig. [T] with ^||Q = 18.2 and uj = 1.406 — 0.073i. Note that fitting the oscillation of the solution at the beat 
frequency (wr — wq) allows a precise determination of the resonant frequency. The fractional error in the 
damping rate is larger, in particular due to the difficulty of fitting the exponential decay in the presence 
of the higher frequency oscillations arising in the gyrokinetic system — compare the difference between the 
simple model (dotted line) and the gyrokinetic solution (solid line). Estimating the error in the frequency 
and damping rate based on these fits, we determine values of oJr = 1.406 ± 0.004 and — oJi = 0.073 ± 0.003. 

4.1.2. Linear frequencies and damping rates 

The model Laplace-Fourier transform solution given by (j64[) is used to determine the normalized fre- 
quencies (wr) and damping rates {—'(^i) of linear modes on the Alfven branch, including the kinetic Alfven 
wave at k±pi 3> 1, over a wide range of plasma parameters. In the upper panels of Fig. [5J we present the 
normalized (a) frequencies and (b) damping rates of the Alfven solution vs. perpendicular wavenumber kj_pi 
for varied values of ion plasma beta /3i = 0.01, 1, 100 at Toi/Too = 1; in the lower panels are the normalized 
(c) frequencies and (d) damping rates of the Alfven solution for varied values of ion-to-electron temperature 
ratio Toi/Too = 0.2, 1, 100 with /3i = 1. The numbers of grids are {Nz, Nx,Ne) = (32, 8, 32). To ensure that 
structure in velocity space does not reach the velocity grid Nyquist frequency, the same species collisionalities 
are set in the range to 0.1|wi,s| ^ ^ I'^i.sl for each run. 

4.1.3. Linear ion-to- electron heating ratios 

Using the heating equations for gyrokinetics [l[, we calculate the ion-to-electron heating ratio Pi/ Pc from 
the linear coUisionless gyrokinetic dispersion relation and compare it to the results of AstroGK. The results 
are presented in Fig.|3]for parameters /3i = 10 and Tq-JTqc = 100, chosen to give the heating ratio that varies 
by many orders of magnitude around k±pi ^ 1. The resolution for these runs is {Nz, N\, Ne) — (64, 64, 32). 
The results of AstroGK show excellent agreement with theory over seven orders of magnitude in P{/ P^. 

4-. 2. Linear Ion- Temperatm^e- Gradient (ITG) instability 

Here we describe linear drift-wave dynamics in the coUisionless limit [45|, |4^ to validate the electrostatic 
calculation of AstroGK with a Boltzmann electron response. We assume Lg^ = k = L^^^ = 0. Considering 
an ion distribution function with exp [i(fc • r — ut)] dependence, from ([2]) we find: 



Toi uj — kiiVw 




th.i 



Jo{ai)Iot, (65) 
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Figure 2: The normalized (a and c) real frequency ZJ^ = ujj^/ujx and (b and d) damping rate — Zjj = —coi/uij^ vs. fcxPi for 
varied ion plasma beta /3i = 0.01,1,100 with fixed Toi/Toc = 1 (a and b) and for varied ion-to-electron temperature ratio 
Toi/Toe = 0.2, 1, 100 with fixed ft = 1 (c and d) from the gyrokinctic dispersion relation (solid line) and linear AstroGK 
simulations (open squares). 
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Figure 3: Ratio of the ion-to-electron heating Pi/Pc- Results are shown for ft = 10 and Toi/Toc = 100. The analytical 
results from the linear coUisionless gyrokinetic dispersion relation (solid line) are compared to AstroGK results (circles), showing 
excellent agreement over seven orders of magnitude in Pi/ Pa- 
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Figure 4: Dispersion relation of the slab ITG instability under the code normalization (see [Appendix A.2[ . Left: frequency 
(ijr), Right: growth rate (idi). Lines are from dispersion relation II67II and symbols are from AstroGK. 



where we have introduced the drift frequency: 

uJ*T^^—j^ ^ ■ (66) 

Plugging this into the quasi-neutrahty condition with the Boltzmann electron response h^. = 0, we obtain 
the following dispersion relation: 



gc»Oc qfnpi 

Zi ' rr^ rji " Ui "^ii ~~ \ "^i / ' rri 



roiCiS(Ci) + 



iroi + hTu ] CiS(Ci) - Ci'roi (i + CiS(Ci)) 



(67) 



where Ci — a;/(fc||Uth,i), and S is the plasma dispersion function |47l |. 

We excite a perturbation of hi oc /oi which generates an electrostatic potential perturbation. A|| and 5B^^ 
are forced to be zero throughout the simulation. For the following parameters, Tqi/Tqc = 1, {k^pi^kyPi) = 
(0, 1), the numerical solution to the dispersion relation (j67[) and the eigenvalues w obtained from AstroGK 
are shown in Fig. 2] We take {Nz,N\, Ne) = (32, 16, 16) for all runs. The figure shows perfect agreement 
between the numerical solution and the theory, even though reproducing a slowly growing mode with small 
fc|| is particularly challenging since AstroGK uses a finite-difference scheme in the Z direction. 

4-3. Collisions and velocity-space resolution 

Here we demonstrate the accuracy of the collision model employed in AstroGK and the spectral con- 
vergence of the velocity-space integration. For the convergence studies, we again consider the electrostatic 
plasma slab with a background ion temperature gradient and a Boltzmann response for the electrons. We 
focus on the case with Toi/Too = fc||iToi = k±Pi = 1, and we use {Nz, Nx, Ne) = (256, 128, 256) as our base 
case resolution. We then independently vary Ne and Nx and calculate the relative error in the ITG mode 
frequency and growth rate. 

Results from the convergence study are given in Fig. [S] The relative error, e, is defined as 

- (68) 



where Wagk is the complex frequency computed in AstroGK and co is the analytic frequency. We see that e 
(dashed line) is less than 1% for Ne }^ 10 and Nx > 3. Also, the relative error exhibits the exponential 
convergence indicative of a spectrally accurate discretization scheme. Accuracy at large number of velocity- 
space grid points is limited only by computational precision, as we see for Nx = 64 (double precision is used 
here). 

Also shown in Fig. [5] is an error estimate calculated on the fly as a velocity-space resolution diagnostic 
in AstroGK (solid line). A detailed description of the diagnostic is given in [36[. The basic idea is to 
calculate the fields using both the standard velocity-space integration scheme and a less accurate integration 
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Figure 5: Error estimates in ITG frequency and growth rate as the number of velocity-space grid points is varied. Left: Ne 
dependence, Right: the number of pitch angles (=2Nx) dependence. The solid line is obtained from the AstroGK velocity-space 
error diagnostics described in the text, and the dashed line is the relative error given by II68II . The error decreases faster than 
a power law, a result of the spectral accuracy of the velocity-space integration. 



scheme that involves dropping a velocity-space grid point and recalculating integration weights. By using 
the reduced grid and corresponding weights, the spectral accuracy is lost. Consequently, the error estimate 
is quite conservative: it is essentially computing the error for a simulation with half the order of accuracy 
(similar to using half as many grid points). In fact, we see in Fig. [S]that the error estimate agrees rather 
well with the actual error at half the number of grid points. The qualitative trend is correct, but one must 
realize that the quantitative error estimate is conservative. 

4-. 3.1. Slow mode damping 

For the collision operator verification, we consider slow mode damping in the low k±pi, high /3i limit. 
Here, analytic expressions can be obtained in both the strongly collisional (fc|[Anitp ^ 1) and coUisionless 
(^||Amfp ^ 1) regimes, where Amfp = fth i/vi is the ion mean free path. The complex frequencies are given 
by 



\k\\ VA 



(69) 



for fcyAmfp > 1, and 



for fc|| Anitp <C 1, with ; cx wth.iAmfp being the parallel ion viscosity. From these expressions, we see that the 
damping in the strongly collisional regime (j70p is due primarily to viscosity, while the coUisionless regime 
(|69p is dominated by Barnes damping. 

In order to isolate the slow mode from the Alfven wave, we take = A|| =0 throughout the simulation. 
The electron dynamics can be neglected because of the high /3i. Consequently, the system is described by 
the ion dynamics coupled with the dB\\ fluctuation (see Section 6 in Q). We initially launch a perturbation 
of the form, Sfi cx (V^^/'^^thi ~ l)/oi5 which generates SB^^ perturbation, and measure the damping rate of 

In Fig. ini we plot the collisional dependence of the numerically obtained slow mode damping rate for 
k±pi = 10~^, (3i = 100. For the most coUisionless case which requires the highest resolution in velocity space, 
we take {Nz,N\, Ne) = (32, 32, 64). We find quantitative agreement with the analytic expressions ((69l) and 
(|70p in the appropriate regimes. In particular, we recover the correct viscous behavior in the fc||A„ifp ^ 1 
limit (damping rate proportional to ^||^i), the correct collisional damping in the fc||Anitp ~ 1 limit (damping 
rate inversely proportional to /i||,i), and the correct coUisionless (i.e. Barnes) damping in the fc||Anitp S> 1 
limit. The proporionality constant c = ^||,i/(i'th,iAmfp) estimated from the simulation is c ~ 2.5, while 
c ~ 0.9 from the Braginskii's analysis [48j |. 
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Slow Mode Damping 




Figure 6: Damping rate of the slow mode for a range of coUisionalities spanning the coUisionless to strongly coUisional regimes. 
Dashed lines correspond to the theoretical prediction for the damping rate in the coUisional (fc||Anifp <C 1) and coUisionless 
{fcyAjnfp 2> 1) limits. The ion parallel viscosity is estimated by fitting numerical data with the theory as ; fa 2.5vth,iAmfp- 
The solid line is the result obtained numerically with AstroGK. Vertical dot-dashed lines denote approximate regions (coUisional 
and coUisionless) for which the analytic theory is valid. 



4.4- Linear tearing instability 

A tearing instability (49l | is one kind of magnetic reconnection process driven by the free magnetic energy 
stored in the current sheet configuration, and is a fascinating example to study in the gyrokinetic framework. 

In coUisional plasmas, inter-species collisions producing the resistivity allow topological changes of the 
magnetic field lines. The singularity occurs around the magnetic neutral line in the ideal limit, which is 
regularized by the finite resistivity. This is a standard boundary layer problem. Th e g rowth rate and 
boundary layer width scalings with respect to the Lundquist number S are obtained in [49[. 

As the collisionality is decreased, the boundary layer becomes narrower until kinetic effects come into 
play, such as the Hall effect (ion inertia), FLR effects, and electron inertial effects. Therefore, the scaling law 
should be altered in such weakly coUisional plasmas. In most situations of interest in fusion and astrophysical 
plasmas, these effects can play a crucial role in the problem. 

AstroGK includes full collision physics, and therefore correctly captures the macroscopic resistivity. (The 
resistivity is given by 77 sa 0.38/ioi'orfoi where Vc is the electron collision frequency and c?c is the electron skin 
depth [50|.) We provide here a scaling study of the problem in purely two-dimensional setting (d/dZ ~ 0) 
as the collisionality is varied, effectively varying the Lundquist number S. We consider an equilibrium 
distribution function of electrons as a shifted Maxwellian 6fc c>c Vj[/oc to give the following current sheet 
configuration: 

-2 f Lx/'^ 



= A'^^ cosh-^ j 5h(x) (71) 

where 

tanh^ [t:^) + tanli^ (xT^: - 27r) - tanh^(27r) 



2tanh^7r - tanh^(27r) 
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is a shape function to ensure periodicity in the box sized ~ 3.27ra, Ajj^ is a constant. We take a = SOpi 
so that the kinetic effect is relatively weak. We perturb the system with kyU — 0.8 (giving a standard 
stability index [i^ A'a ~ 23.2), and observe the linear growth rate. We fix {Nx,Ne) = (20, f 6), and 
varies from 512 to 2048 to resolve the current layer. 

Figure [7] shows the scaling of the growth rate and current layer width as the Lundquist number S is 
increased. The Lundquist number is defined by S' = ^ioo-va±/v = 2.63(i/oTA)~^((ie/a)~^ with ta = a/vA±, 
so increased Lundquist number corresponds to decreased electron coUisionality Vc- The growth rate is 
normalized by the standard MHD time scale ta. The Alfvcn velocity va± is measured by the peak value of 
the background magnetic field given by ([7T|) . 




lo' lO"* lo' 10* lo-* 10'' lo' 10*" 

Lundquist Number: S Lundquist Number: S 



Figure 7: (Color online) Normalized growth rate (left) and current layer width (right) scaling against the Lundquist number 
S. Those obtained from AstroGK (symbols) and from the two-fluid model (lines) are shown for three sets of (/3e, Tii/n^e). 



For a fixed value of Toi/Toc = 1, we evaluate the Lundquist number dependence for three sets of parame- 
ters: (/3c, Tij/mo) = (0.3, 100), (0.075, 400), and (0.01875, 1600). For the given parameters, the characteristic 
scale lengths of the kinetic effects are the ion sound Larmor radius pse = v?bo/"^/^i ~ 0.0141a, and the 
ion inertial skin depth di/a = 0.365, 0.73, 0.146. We also show the scaling obtained from the two-fluid MHD 
model by Fitzpatrick and Porcelli [51| for reference. For the given parameters, the two-fluid scaling is almost 
equal to that from the single-fluid model. 

For the small /3e case, we observe that the gyrokinetic scaling is close to that from the fluid model as 
expected. However, as /3e is increased, the scaling deviates and the growth rate decreases as kinetic effects 
become non-negligible. The major difference between the two-fluid model and the kinetic simulation using 
AstroGK originates from the treatment of the second-order velocity moment (temperature or pressure). The 
fluid model assumes adiabatic ions and isothermal electrons, whereas the gyrokinetic pressure is generally 
tensorial and thus contains far richer physics (soj . 

4-5. Orszag-Tang vortex problem 

To validate that results of AstroGK for a nonlinear electromagnetic problem, we present here the well- 
known MHD vortex problem of Orszag and Tang (sij . Their original simulation solves incompressible 
reduced MHD equations for the stream function ip (= —(j)/Bo) and flux function ?/> (= A\\) defined in the 



* The initial distribution function of electrons in total becomes; 

Sf. = ^V^\hc X;(fc±rfo)'(A;|\ + eA°^5(fc,a = 0, kya = 0.?.))e'^l''l/^e"^^-^'^ , (72) 

where is a Fourier representation of 1171 II . the second term gives a small sinusoidal perturbation with e being a small 

constant, and 5 is the Dirac's delta function. 
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plane perpendicular to the mean magnetic field: 



^Vlip + {ip,Vl^} = 

ot nonoiirii 

at fiQ 



{VsViV'} + 



noirrii 



(73) 
(74) 



where ^ and 77 are the viscosity and resistivity. The nonlinear evolution of a system governed by the reduced 
MHD equations provides a useful comparison for the nonlinear evolution of the GK-M equations using 
AstroGK. Since GK-M equations in the large-scale limit, k±pi ^ 1, simplify to the equations of reduced 
MHD the results of AstroGK in this limit should be similar to the results of a reduced MHD code, which 
does not contain the small scale physics that occurs when k±pi > 1. 
Given an initial condition: 



ip — — 2avA± cos 27r 



cos 2tt- 



L, 



X y 

ip =aB±o { cos47r- h2cos27r— ) , 

Lx Ly ^ 



(75) 
(76) 



where va± = B / ^JjMyn^yjn[ and = Ly = 27ra, we have made a simulation of the initial value problem 
using an independent reduced MHD code. The initial conditions clearly define the standard MHD units for 
normalization: namely, system size a and Alfven transit time ta = a/vA_L- The dimensionless dissipation 
parameters are the Reynolds number Rc = avAj_nQmi/ fj. and the Lundquist number S = fioavA^/Ti. We use 
{Nx,Ny) = (256,256) grid points with the dissipation coefficients 1/Rc ~ 1/S = 1.5 x 10"'^. 

For the AstroGK simulation, we set initial conditions on the distribution function for each species hs to 
give the corresponding fields. Therefore, we define the distribution functions by 



X y \ 

hi =C0 ( cos27r— +cos27r— /oi, 

--Ca, ( 2cos47r^ -f cos27r^ ) V^|/oe, 

Lx Ly J 



(77) 
(78) 



where and are coefficients chosen such that the resulting and A\\ are equivalent to ([75)1 and ((75)) . 
We use {Nx.,Ny,Nx,NE) = (256,256,32,16) points, Toi/Too = 1, Wi/mo = 1836, A 10"^, Pi/a = 0.01, 
and ignore collisions. We assume 8B\\ = 0. 

Time evolutions of various energies are shown in Fig. |S] for both codes. Time and the energy are 
normalized to the MHD units, i.e. ta and B\qI The result from AstroGK simulation also shows the 
gyrokinetic energy: 



GK 



— LxLy ^ 
= LxLy ^ 



E 



2/( 



Os 



Vos 



2T, 



E 



2Tf 



Os 



2fio 



(i-ro.)|0fc,| 



2 , fcjl^ii./^J^ 

2/zo 



(79) 



normalized into the MHD units. The kinetic energy {Ek = ^xLy^^.^ (noimi/2) \k±(pk^\'^) and magnetic 
energy (£'m = LxLyJ^k^^ l^xV^fci P/(2/^o)) evolve similarly in the two models. It is noted that the leading 
contribution of the ion part of the second term in (j79p yields the kinetic energy. Thus the difference between 
the gyrokinetic energy Eqk and the MHD energy E'mhd = Ek + Eu comes from and the electron part 
of the second term in ()79p . The apparent agreement of Eqk and Emhb in the initial phase is not strange 
since the energy contained in the fields is much bigger than that in g by a factor of /3^^ ■ This is true for the 
Alfvenic dynamics — the pure linear Alfven wave propagates with approximately zero g Q. 
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Figure 8: Time evolution of various energies. Red solid lines represent values from AstroGK and blue dashed from reduced 
MHD simulations. Note that -Eqk is only defined for AstroGK simulation. 
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Figure 9: Stream and flux functions at t/r^ = 1 taken from reduced MHD (left) and AstroGK (right) simulations. 
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Contour plots of the stream and flux functions arc also shown in Fig. [HI The overall agreement is almost 
perfect, which implies that AstroGK correctly reproduces MHD results. The small differences in these plots, 
especially at the small scales, are due to the kinetic effects resolved by AstroGK and are therefore physically 
meaningful. A more thorough discussion of these differences will be discussed elsewhere. 
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5. Performance 

In this section, we determine the scahng of AstroGK with each of the problem dimensions and evaluate the 
strong and weak scalings of the parallel performance. The scaling performance tests in this section employ 
a nonlinear simulation of driven turbulence with plasma parameters Pi = Tqi/Toc = noi/noc = —Qi/qc = 1, 
nii/mc = 1836. The simulation is stirred by an antenna at the smallest wavenumber in the box corresponding 
to k±pi = 1 and collisions are turned off. 

5.1. Single processor scaling with problem dimensions 

We can determine the scaling of the time per step in AstroGK as each of the dimensions of the problem 
is increased. These tests are performed on a single processor to eliminate the requirement for communi- 
cations between processors. We begin with a small nonlinear run with the following problem dimensions: 
{Nx, Ny, Nz, N\, Ne, Ns) — (4, 4, 8, 4, 2, 2). To test the scaling of the time per computational step for a given 
dimension, we increase only that dimension successively by a factor of two until the problem is too large for 
available memory; the time per step is measured for each of these runs. The results are presented in Fig. 1101 
The Nx and Ny dimensions scale asymptotically as A^logA^, as expected for fast Fourier transforms. The 
timestep is expected to scale with the number of grid points along the mean magnetic field as iV^ because 
of the field solver. However, for practical problem sizes {Nz < 1000), the field solver is still subdominant 
compared with the gyrokinetic solver. Therefore, we observe N^ dependence with c < 2. In each of the 
dimensions Nx, Ne, and Ns, the scaling is linear as anticipated. Thus, the wallclock time per step on a 
single processor scales as 

istep (X {Nx\ogN,){Ny\ogNy){N'z){Nx){NE){N,). (80) 



5.2. Parallel performance scaling 

Parallel performance of AstroGK is measured by taking the weak and strong scalings: The weak scaling 
is probed by holding the computational work per processing core constant while the number of cores, thus 
the total problem size, is increased. On the other hand, the strong scaling is probed by holding the problem 
size constant while the number of processing cores is increased. Both tests are performed on Kraken Cray 
XT5 system at the National Institute for Computational Sciences at the University of Tennessee. Kraken 
consists of 8256 compute nodes each having 12 processing cores, resulting in 99,072 compute cores in total. 

The number of grid points is chosen such that parallelization is achieved efficiently in the layout 'yxles'. 
In fully developed kinetic turbulence, fine structure develops in velocity space as well as in position space, 
thus, it is required to take the same order of grid points in both spaces. The referenced maximum number 
of total grids is 256^ x 128^ for the current highest resolution runs [28[. Here, however, position-space 
resolution is taken relatively small in the strong scaling because of the memory requirement for small 
number of processors. 

5.2.1. Weak scaling 

The initial problem uses {N^, Ny, Nz, N\, Ne, Ng) = (160,160,36,6,4,2) on 12 processing cores. Each 
time the processing core count is doubled, the problem size is doubled by alternately doubling first Ne 
and then N\ — since both of these dimensions scale linearly with the problem size, doubling one of these 
dimensions effectively doubles the computational work, leading to fair assessment of the weak scaling. The 
weak scaling behavior of the wallclock time per step istcp vs. the number of processing cores up to A^pioc ~ 
12,288 is plotted in Fig. [TTJ AstroGK follows the ideal scaling until A^proc ~ 12,288 with slight degradation 
of performance (~ 5%) due to the increase of communication for A'pi-oc > 1000. The layout specified for the 
parallel communication for this test is 'yxles'. 
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Figure 10: Single processor scaling of the time per step tstep vs. N, where corresponds to; either the number of grid points 
Nx (crosses) or Ny (squares) in the top panel, Nz in the middle panel, and either A'';^ (crosses) or A^^; (squares) or A''^ (circles) 
in the bottom panel. The asymptotic scalings are achieved for Nx,y as NlogN (top), and for a s A" (bottom). For Nz 
scaling (middle), the asymptotic scaling is not achieved. The field solver scales as N^, but is still subdominant for the practical 
problem size. 
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Weak Scaling @ Kraken [Cray XT5] 
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Figure 11: Weak scaling of AstroGK determined by holding the computational work per processor constant while the number 
of processors is increased. The time per step istcp vs. the number of processing cores Afproc is plotted. 



5.2.2. Strong scaling 

The dimensions of the nonhnear turbulence problem employed for this scaling are {N^^Ny, Nz, Nx, Ne, Ng) = 
(32,32,24, 192,256,2). The strong scaling behavior of the wallclock time per step istop vs. the number of 
processors A'^proc is plotted from Np,-oc = 48 to A^proc = 98,304 in Fig. [12] Again, the layout specified for the 
parallel communication is 'yxles'. 

To accommodate this large computational problem on a small number of processors requires more mem- 
ory per core than is available when all 12 cores on a compute node are used. Therefore, for the lowest four 
data points on the scaling curve (up to 384 processors), only 1, 2, 4, and 8 core(s) are utilized per node. The 
rest of the runs all utilize 12 cores per node. As the number of cores per node increases, the computation 
time deviates from the ideal linear scaling ("Ideal (48)" line in the figure). The sharing of communication 
and memory bandwidth between multiple cores lead to a factor of two degradation of performance. If the 
number of cores per node is fixed at 12, we observe a nearly ideal strong scaling from A'pioc = 384 up to 
Npioc = 24,576, as indicated by "Ideal (384)" line in the figure. Significant performance loss occurs only at 
Nproc = 49,152. 
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Strong Scaling @ Kraken [Cray XT5] 
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Figure 12: Plot of strong scaling taken by increasing the number of processors for the fixed problem size. The time per step 
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6. Summary 

We have presented detailed descriptions of the gyrokinetic-Maxwell equations solved in AstroGK and the 
algorithms adopted in the code. It employs an unconditionally stable implicit method for the linear terms 
and a third-order explicit multistep method for the nonlinear term. To reduce the computational cost for 
the implicit solve of the linear term, it utilizes the Godunov splitting method. Together with the Green's 
function approach developed by Kotschenreuther et al., AstroGK solves an Nz x Nz size linear system at 
each velocity grid point for each species. For collisionlcss runs, the overall accuracy of the algorithm is 
second order in A< and second order in AZ, with spectral convergence for the velocity-space integration; 
for coUisional runs, the velocity-space derivatives lead to a drop in the overall accuracy to first order in At 
and first order in the velocity-space integration. 

The computational cost on a single processor follows the theoretical scaling except for the Nz dependence. 
The time per computational step is expected to follow 7V| because of the field solver for large Nz- However, 
for practical problem sizes Nz ^ 1000, the cost of the gyrokinetic solver is still larger or comparable with 
that of the field solver, and we do not observe the asymptotic scaling. Excellent parallel performance has 
also been demonstrated for both weak and strong scaling tests, showing the ideal scaling up to about 10,000 
processors. 

The algorithms at the heart of AstroGK are the same as those in GS2, but the AstroGK code has been 
streamlined by the removal of the magnetic geometry and trapped particle effects. Therefore, it is opti- 
mized for the study of fundamental, low-frequency kinetic effects in simple plasma geometries and for the 
exploration of the dynamics in many astrophysical plasmas of interest, in which the mean magnetic field at 
Larmor radius scales can often be well approximated as straight and uniform. 

AstroGK is also an ideal developmental testbed, both for novel computational approaches and application 
to new physical plasma systems. One idea currently under development is the treatment of electrons as a 
fluid, since certain situations are indifferent to the kinetic behavior of the electrons. For instance, if one 
focuses on ion kinetic effects at scales k^pi ~ 1, the large ion-to-electron mass ratio rrii/mc 3> 1 may lead 
to negligible electron kinetic effects in some cases. In [3|, the isothermal electron fluid equations are given. 
Implementation of this alternative treatment of the electron dynamics in AstroGK will lead to dramatic 
improvements in computational speed for the study of ion kinetic effects. 

One important aspect of AstroGK is its portability. It is designed to work on a wide range of computing 
environments, from individual desktop PCs to petascale supercomputers, and new users can easily port it 
to their particular computing environment. It also supports major FORTRAN 95 compilers. The potential 
drawback of this emphasis on portability is that it is not necessarily optimized to any particular architectures. 
Further efforts to optimize the code will be made to enhance both serial and parallel efficiency of the code. 
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Table A.l: Symbols in the gyrokinctic-Maxwell equations and their explanations, definitions and normalizations. The subscript 
s signifies the symbol is dependent on species, and Boltzmann's constant is absorbed to give temperature in units of energy. 

Symbol Explanation Normalization 

Bq Mean magnetic field 

bo = Bq/Bq Unit vector in mean field direction 

/os Background Maxwellian dist. func. 

hs Non-Maxwellian part of dist. func. ehsfos 

(f) Electrostatic potential e(?!)(Too/go) 

j4|| Parallel component of vector potential eA\\ (wtho2oo/?o) 

Parallel component of magnetic field eSB^^Bo 

rUs Mass rhsniQ 

Qs Electric charge g^go 

nos Density of the background fios?^oo 

Tos Temperature of the background TqsTqq 

Vg Collision Frequency ^'s(t^th,o/ao) 

Lbo; Lnos^ -^Tos Scales of the background LBo,nQ^,To,o,Q 

K Curvature of the mean field i^/ao 

vth,s = \/2TQs/ms Thermal velocity ^Tbs/m^Wtho 

Wtho = \/ 2Too /toq Reference thermal velocity 

ils = \qs\Bo/ms Cyclotron frequency {\qs\/ms)^o 

Qq = qoBo/niQ Reference cyclotron frequency 

Ps = vth,s/^s Thermal Larmor radius ^\/TOsTbs/|gs|^ po 

po = ■\/2moToo/(go-Bo) Reference thermal Larmor radius 

ds = ^/ms/{ponosq'^) Inertial skin depth ^/rhs/ {(iofiQsqDpo 

= 2ponosTos/BQ Plasma beta nosTos/3o 

/3o = 2/ionooToo/^o Reference plasma beta 

as ~ k±V±/ils Argument of the Bcssel functions 

bg = {k±ps)'^/2 Argument of r„ 

Pq Vacuum permeability 



Appendix A. Symbols, definitions, coordinate, and normalization 

In this Appendix, we present complementary explanations of the symbols, coordinate systems, and 
normalizations used in Section [5J Table lA.ll lists the symbols and their explanations, definitions, and 
normalizations. 

Appendix A.l. Coordinate 

In gyrokinetics, it is convenient to describe dynamics of the distribution function in the gyro-center 
coordinate {Rs, Vs) rather than the particle coordinate {r,v). The following equations: 

Rs^r + Vs = V, (A.l) 

define linear transformation of the particle coordinate to the gyro-center coordinate, called the Catto trans- 
form 0. We consider the Cartesian coordinate in r and Rs '. 

r^xx + yy + zz, R, =X,X, +Y,Y, + Z^Z,, (A.2) 

where x, y and -X'^, are unit vectors that span the plane perpendicular to the mean field in z = Z^. 
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Derivatives with respect to the position coordinate are equivalent (d/dr ~ d/dRg), while those with 
respect to the velocity coordinate are different, which appears in the collision operators (See [Appendix B] 
and Refs. (Ulli). 

The velocity coordinate is written in two different ways in the text. In the gyrokinetic equation, we 
mainly use the polar coordinates (V|| ,V±,Q): the relations to the Cartesian components are given by 

K 



V±=JV,^ + Vl T^|=y„ tane=-^, (A.3) 



and I V| = F = yV± ^ convenient to write in energy and pitch-angle coordinates when the collision 

operator and velocity space integrals are considered: 



E-Vl + V,f, A=^. (A.4) 



Appendix A. 2. Normalization 

The GK-M equations solved by AstroGK are cast into dimensionless form through the normalization of 
all quantities with respect to the parameters of a reference species (denoted by the subscript 0), the mean 
magnetic field strength i?o, and the parallel length scale oq. 

The presence of a mean magnetic field leads to different characteristic temporal and spatial scales in the 
parallel and perpendicular directions. The length scale in the perpendicular plane is characterized by the 
thermal Larmor radius of the reference species, po- The ratio of the perpendicular and parallel scales defines 
the small expansion parameter e = po /qq ^ 1 in gyrokinetic theory [l[ Q . Denoting normalized quantities 
with the "hat" symbol, we define the normalized length scales in perpendicular and parallel directions by 
kj_ = k±po and fc|| = fc||ao. The time scale is normalized by the thermal crossing time of the reference species 
in the parallel direction, t = t/{ao/vtho)- 

Species dependent quantities retain their species subscript s after normalization to the reference species, 
for example mass rhs = mg/mo. The two-dimensional velocity space of distribution function employs a 
species dependent normalization so that integrations over velocity space remain efficient even when char- 
acteristic thermal velocities of the plasma species differ by a large factor. The coordinates used in velocity 
space by AstroGK are the energy Es = (l/2)msU^ and pitch angle = vj_ ^/{v'^Bq), which are related to 
the magnetic moment by XsEs ~ msv\ g/(2BQ). Normalizing the velocity to the species thermal velocity 

Vs = v/vth^s, the dimensionless velocity space coordinates are given by Es = ^1 and As = sl^l- 

The first-order fiuetuating quantities in the GK-M equations are the distribution function for each species 
f/s and the electromagnetic field variables: the scalar potential </>, the parallel component of the vector 
potential A\\, and the parallel component of the magnetic field 5B\\. The distribution function is normalized 

by 

9s = , (A.5) 

JOs Po 

where /o,,//oo = "Os exp (--0^) / (j^^^'^i'th.s^ '^ith /oo = "-oo/wtho- The fields are normahzed by 

0= — 77^, A\\^—vtw^^, 5B\\= — ^. (A.6) 

Po J 00 Po -too Pa i'o 

Notice that the dimensionless normalizations for all fluctuating, first-order quantities are also multiplied by 
a factor aa/pa so that all normalized terms have unity order of magnitude. 

An illustration of the normalization of velocity space integrals in Maxwell's equations follows for the inte- 
gral / gsdv. To normalize the integral, we multiply by 1/ {v^i^Qfoo)iO'o/po) to obtain nos / ^6^"^ /tt"^/^^ gsdv^. 
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Appendix B. Model collision operator 

In this Appendix, we present the model collision operator employed in AstroGK. This complements the 
overview of the numerical implementation given in Section 13.3.41 

AstroGK uses the model Fokker-Planck collision operator given in [sH, [3^, which includes the effects of 
pitch-angle scattering and energy diffusion while satisfying Boltzmann's ff-Theorem and conserving particle 
number, momentum, and energy. Upon gyro-averaging, the same-species collision operator is written in the 
spectral representation as 



where 



and 



CkAhk^) = C^ihkJ +CB{hkJ +ULihkJ +UBihkA, (B.l) 

Cd(I.*J = ~ (-iK"/""').*/.!;^) - ".("/".k)^ (1 - e) h,,. (B.3) 

are the gyro-averaged Lorcntz and energy diffusion operators, respectively. Together, these form the exact 
test-particle piece of the linearized Landau operator. The velocity-dependent collision frequencies ^'D and 
:/|| are given by 

with = exp(— y^)dy the error function, G{x) = — xd^/dx) /(2x^) the Chandrasekhar 

function, and ly = \/2TmQq'^ In A/ ^m^/^TQ^^^ the same-species collision frequency, which is an input param- 
eter. 

The test-particle operator given above docs not conserve particle momentum and energy, so the additional 
terms Ui^ and Ud are added to recover conservation properties. Care is taken in choosing the form of these 
conservation terms so that Boltzmann's _ff-Thcorcm is respected. With these constraints, one obtains: 

J.,, . j VTiV\\Ma)hk^dv , ^ ^ , ^ viyv^Ji{a)hk^dv \ 

othihkj = t'D/o Jo{a)vn 7 J—: \- Ji{a)v±_ ^— , (B.5) 

\^ J lyovpodv J vuvpodv J 

and 

7, /, N A f ( T f \ J ^i^v\\Ma)hk^dv J A,yv±Ji{a)hk^dv \ 

HuihkJ ^-Aufo Jo a W|| rr- — j— + Ji{a)v±_ ^r- — j— 

\ J Aiyvpodv J Aiyv^odv J 

+ Jo[a)fo- f 7-Fl, ' (^-6) 

where the additional collision frequencies Av and ve are defined as 

Av = i^u-2{v/va,fiy\^, (B.7) 
i^E = - + 2Av) . (B.8) 

The terms Ui^ and Ud are treated separately in AstroGK so that the Lorentz and energy diffusion operators 
can be split with the conservation properties and iJ-Tlieorem maintained within each splitting. When 
combined, these conserving terms constitute an approximation to the field-particle piece of the linearized 
Landau operator. 



36 



The effect of ion-electron collisions are neglected in AstroGK because they are small in the electron-ion 
mass ratio. However, electron-ion collisions are comparable in size to same-species collisions, so they are 
retained. Consequently, the electron collision operator has the following additional term: 

C^'{hk^,c) = Cl\hk,,c) + iy^'^^T^M(^c)hc, (B.9) 

"th,o 

where ; is the perturbed ion parallel flow velocity, and C£' and j/^ are obtained from their same-species 
counterparts by replacing the same-species collision frequency v with the inter-species collision frequency. 
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Appendix C. Laplace Fourier solution for driven gyrokinetics 

Wc derive a Laplace-Fourier solution for time dependence of the j4|| amplitude for a driven gyrokinetic 
system in this Appendix. The solution presented here is for linear, collisionlcss gyrokinetics without SB\\ , 
the nonlinear term, and the linear driving terms, and with an external driving force given by 



^antenna J ^||0^ i ^ u -^-^ 



n ~ \ t < 

We also set rioi/^-Oc ~ ^Qi/Qc = 1 for simplicity. Given the driving parameters used in the code, the solution 
should agree with a result obtained from the code without free parameters. 

Performing a Laplace transform in time and a Fourier transform in space on this system of equations, we 
can solve for the Laplace-Fourier transformed distribution function for the driven Fourier mode (without 
the species index s): 



~ 9k^„ (0) w ^ 

5fc_Lo - — -T7 — — - —Jo(ao) 
P + iK||o 1^1 To 



p + ifciioMi p + ikuoVu 



(C.2) 



where ao = /j^oKl/O. Setting the initial conditions to zero, <?fc^o(0) = ^||_fe^(,(0) = 0, and substituting into 
Maxwell's equations, we obtain: 

where we have used the following definitions to simplify the notation: 

E\\ =0fc^o , ^11 = — , (C.4) 

H=J2^il + TosC^iO) , A- = ^ (1 - Fo.) , (C.5) 

^ — ^ J-Os J-Qs 

s s 

-JpA^^ _^ i^^ 

P + 1^0 fc|tO^'A 



^io = (^-L0/Oi)^/2, VA ~ Bq/ y/fionoimi, ~ ip/(fc||oWth,s), and ^ is the plasma dispersion function (47[. The 
dispersion relation: 

p' + [Qipf=p' + ^J^-0 (C.7) 

leads to the Alfven wave solutions. 

Now we will focus on the solution for A^^ j^^^lt). The Laplace-Fourier solution is 

A,k^M = ^9^P)^ . (C.8) 

(p2 + [Q(p)]2)(p + i^o) 

To proceed further, we make the approximation: 

+ c^ip + iuji)ip + itj2), (C.9) 

where a;i_2 are the complex cigenfrequcncies independent of p; we know for this system these solutions 
typically have the form loi = Wr + iwj, UJ2 = —'^r + iwi with a negative growth rate Wi < 0. With this 
simplification, the inverse Laplace transform is easily found by application of the residue theorem to find: 



A\Uk^oit) _ [Q(-ic^o)]'e--ot [Q(-iwi)]2e--i* , [Q(-ic^2)] 



2 



e 



-lU}2t 



Ano {UJI - UJo){u!2 - UJq) (wo - Wi)(w2 - Wl) {uJa - UJ2){UJ1 - UJ2)' 



(C.IO) 



Note that the second and third term will decay with time. This solution is computed numerically for 
comparison to code results, as presented in Section 14.1.11 
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